Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PDF parser: add required attributes from the document context to section #3554

Merged
merged 1 commit into from
Sep 14, 2023

Conversation

buchen
Copy link
Member

@buchen buchen commented Sep 14, 2023

No description provided.

@buchen
Copy link
Member Author

buchen commented Sep 14, 2023

As discussed, this change allows to "require" document context attributes.

It really does not reduce the number of lines of code in the extractor - but it fails with better error messages if a property such as currency is actually missing.

What do you think?
Merge? Or not worth it?

@Nirus2000
Copy link
Member

Hello @buchen
Is it still possible that documentContext could also be optional?
Sometimes it is possible that we need to search and compare multiple attributes.

Regards
Alex

@buchen
Copy link
Member Author

buchen commented Sep 14, 2023

Is it still possible that documentContext could also be optional?

If it is required, we could add a

.optionalDocumentContext("currency")

I was trying to avoid it, because it makes the code harder to understand and maintain.

Why? The attributes of the sections are always required. The flag "optional" is an all-or-nothing flag on the whole section. Now we would add an optional flag no the attributes level - but only on some attributes. From a semantic point of view it becomes kraut and rüben.

Plus: the added value of defining an optional document attributes is very small. For required document attributes we fail if they do not exist - but obviously we do not check this for optional attributes. So it boils down to writing v.get("X") instead of type.getCurrentContext().get("X") which is a difference of only 23 characters.

@Nirus2000
Copy link
Member

Okay, i understand.
Let's start...

@buchen buchen merged commit a7edb71 into master Sep 14, 2023
@buchen buchen deleted the feature_mixin_document_properties branch September 14, 2023 17:29
buchen pushed a commit that referenced this pull request Sep 25, 2023
…e document context

Refactoring of PDF importers after adding required attributes from the document context (#3554 + #3556)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants